I. Introduction

With the rapid growth in computerized data
processing and management, there is a great need for
improved techniques for cataloguing machine-readable
data bases. The purpose of this report is to define a
system by which computerized data bases may be catalogued
for easy reference and availability. The system was
developed from a computer scientist's viewpoint, with
emphasis on identifying what information should be
included in or excluded from such a catalogue. A glossary
is also included to provide a standard reference base.

The objective of the proposed cataloguing system
is to provide the potential user with information which
would help him decide whether or not he would want to use
a particular data base.

It is hoped that this report may serve as a first
step in the development of cataloguing procedures for use
by libraries and other agencies.

II. Included Information

The proposed cataloguing system should include
eight specific items:

1) Owner Name and Location
2) Author
3) File Name
4) Subject
5) Date
6) Record Count
7) Fields per Record
8) Security

1) Owner Name and Location

The owner's name and location are a necessity
to a potential user, since the user must be able to
contact the owner, whether for further information regarding
the data base or for actual arrangements for usage.

2) Author 

As in the case of published material, it is
necessary for an individual to receive credit for his work.
However, inclusion of an author in a catalogue of data
bases is even more important, as it would in many cases
provide a potential user with the name of a specific
person to contact regarding use of the data base. In
instances where the data base had no single
author, this field may indicate a corporate authorship or,
if unknown, could be left blank in the catalogue.

3) File Name

The file name serves to identify an individual data
base. This field is actually the key field in the
catalogue, as each data base entry would have a
different file name within any one owner's library.
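
The role of the file name as a key can be illustrated with a
minimal Python sketch. The class name Catalogue, its methods, and
the owner and file names used here are assumptions made for
illustration only and are not part of the proposed system. The
sketch shows only that a file name must be unique within one
owner's library, while different owners may reuse the same name.

```python
class Catalogue:
    """Hypothetical in-memory catalogue keyed by (owner, file name)."""

    def __init__(self):
        # Maps (owner, file_name) -> entry dictionary.
        self._entries = {}

    def add_entry(self, owner, file_name, entry):
        key = (owner, file_name)
        if key in self._entries:
            # The file name is the key field: no duplicates per owner.
            raise ValueError(f"{file_name!r} already catalogued for {owner!r}")
        self._entries[key] = entry

    def lookup(self, owner, file_name):
        return self._entries[(owner, file_name)]


catalogue = Catalogue()
catalogue.add_entry("Library A", "ORDERS", {"subject": "Book ordering"})
# A different owner may reuse the same file name.
catalogue.add_entry("Library B", "ORDERS", {"subject": "Purchase orders"})
```

Keying on the pair rather than on the file name alone reflects the
statement that uniqueness is required only within any one owner's
library.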

4) Subject 

The subject field allows for a few sentences to
describe the data content of the file. It is this field that
a user would be most interested in, and a keyword or subject
system should be established here to allow more rapid location
of desired information. Even if a keyword system were
employed, it would still be advantageous to include a
description of the contents, use, significance, or other
pertinent information about the data base which would serve
to enlighten the potential user as to whether he would want
to use the file or not.
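
One way a keyword system might sit alongside the free-text
description is sketched below in Python. The report does not
specify how keywords are to be stored or searched; the function
build_keyword_index, the entry layout, and the sample file names
and keywords are all hypothetical.

```python
from collections import defaultdict


def build_keyword_index(entries):
    """Map each keyword (lower-cased) to the file names that carry it."""
    index = defaultdict(set)
    for entry in entries:
        for keyword in entry["keywords"]:
            index[keyword.lower()].add(entry["name"])
    return index


# Illustrative entries; the names and keywords are invented for this sketch.
entries = [
    {"name": "FILE-A", "keywords": ["personnel", "university"],
     "subject": "Personnel data for university employees."},
    {"name": "FILE-B", "keywords": ["accounting", "university"],
     "subject": "Fund accounting and report printing."},
]

index = build_keyword_index(entries)
print(sorted(index["university"]))  # ['FILE-A', 'FILE-B']
```

The keyword index narrows the search quickly, while the subject
text kept with each entry still lets the user judge whether the
file is worth pursuing.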

5) Date

The date category is broken down into smaller
sub-fields depending on whether the information is of a
periodic nature or not. In the case of time-related
information there are four sub-fields:

   a) Time Span
   b) Volume Period
   c) Release Date
   d) Retention

a) Time Span - This sub-field indicates the
   exact time period covered by the information
   within the data base.

b) Volume Period - This section indicates the
   time interval between new issues or volumes
   of the file.

c) Release Date - The release date serves to
   indicate the point in time that a new volume of
   a series would be made available for use.

d) Retention - Unlike published material, machine-
   readable data bases are normally obsoleted
   after a certain length of time. It is important
   to include this in a catalogue to provide the
   user with a definite date when he can no longer
   obtain information. Archival policy may
   override this date.

In the case of material which is not time-related,
only two sub-fields are required:

   a) Creation Date
   b) Retention

a) Creation Date - The creation date is the date
   the file was originally released in machine-
   readable form.

b) Retention - (same as above).
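
The two forms of the date category can be sketched as two small
Python record types. The class names PeriodicDate and OneTimeDate,
the helper sub_field_names, and the sample values are assumptions
made for illustration; the report itself prescribes only the
sub-fields, not any particular representation.

```python
from dataclasses import dataclass


@dataclass
class PeriodicDate:
    """Date field for time-related (periodic) data bases."""
    time_span: str       # exact period covered by the data
    volume_period: str   # interval between new issues or volumes
    release_date: str    # when a new volume becomes available
    retention: str       # how long the file is kept


@dataclass
class OneTimeDate:
    """Date field for data bases that are not time-related."""
    creation_date: str   # first release in machine-readable form
    retention: str       # same meaning as for periodic files


def sub_field_names(date_field):
    """Return the sub-field names carried by a date field, in order."""
    return list(vars(date_field))


# Illustrative values only.
periodic = PeriodicDate("1968 to 1970", "1 year", "January 1", "5 years")
one_time = OneTimeDate("June 1969", "Indefinite")
```

Keeping the two cases as separate types makes it impossible to
record, say, a volume period for a file that is not issued
periodically.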



6) Record Count

The record count serves to describe the size of the
data base, much as the number of pages describes the size
of a book. It would certainly help a user in determining
the amount of information available to him.

7) Fields per Record 

The number of fields within a record describes the
size of each record. A field is a unique piece of
information within the record; thus, this measure would help
the user in determining the complexity or extent of
detail of the information within the data base. It may
specify that the records contain a variable number of
fields, with indicated typical and maximum counts. Also,
there are cases wherein this field has no significant
meaning.

8) Security

It is anticipated that most, if not all, data bases
could and would be catalogued in libraries, and since many
data bases would be considered "classified" by their
owners, it would be useful to have a security description.
This would spare interested users many needless
inquiries into the availability of classified data and
still allow maximum operation of the cataloguing system.
Possible categories could be as simple as "public",
"private", and "semi-public". Public files would be open to
all interested users, private files would be completely
closed, and semi-public files would have some information
available, leaving it to the user to contact the supplier
for more details.
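
The three suggested security categories lend themselves to a
simple enumeration. The Python sketch below is illustrative only;
the names Security and availability_note and the wording of the
notes are assumptions, since the report offers the categories
merely as one possible scheme.

```python
from enum import Enum


class Security(Enum):
    PUBLIC = "public"
    SEMI_PUBLIC = "semi-public"
    PRIVATE = "private"


def availability_note(level):
    """Return a short note telling a user what to expect."""
    if level is Security.PUBLIC:
        return "Open to all interested users."
    if level is Security.PRIVATE:
        return "Completely closed; inquiries are not useful."
    return "Some information available; contact the supplier for details."


# Look up a category from the text as it would appear in a catalogue entry.
print(availability_note(Security("semi-public")))
```

A fixed set of values keeps catalogue entries comparable across
owners and lets a user rule out closed files without writing to
the owner.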

III. Excluded Information

The proposed cataloguing system would exclude the
following four items:

1) Medium and Mode Description
2) Field Description
3) Extent of Usage
4) Miscellaneous Dates

Each was given careful consideration before being dropped,
and was rejected because it was felt that the amount of
useful information given was not worth the extra space and
time required to include it in the catalogue.

1) Medium and Mode Description 

With the large number of storage devices on the
market today and the many varied code formats, it is virtually
impossible to include all necessary information in a
catalogue. Also, if a user were definitely interested in using
a data base, he would perhaps need it converted to a form
acceptable to his equipment. It is anticipated that most
data would need to be converted for each different user.
Therefore, it is not really worthwhile to include detailed
hardware descriptions or recording formats.

2) Field Description

Keeping in mind the need for a brief catalogue, it
is hardly practical to present a detailed field
description. Many data bases will have scores of fields, and to
include a description or even a name for each would require
much more room than a catalogue could allow.

3) Extent of Usage

It is often helpful to know the significance and
scope of use of a data base simply for a better
understanding of the data base itself. However, it would be
difficult to lay down rigid guidelines for description of
such nebulous terms as significance and scope.
Any information along this line could be included in the
subject area if so desired.

4) Miscellaneous Dates 

Many key dates were considered before finally
arriving at those included in the catalogue. Dates omitted
included the following:

   a) Time of data collection
   b) Time of assembling data into
      machine-readable format
   c) Date of output into current
      logical format
   d) Date of output into current
      physical format
   e) Dates of supplementary files

The preceding dates are informative but of little
use in providing information concerning the contents
of the data base.



IV. Conclusion

Design of the cataloguing system was influenced by
the need for maximum ease of usage and data description
relevant to the needs of most users. Thus the most
important goal of a cataloguing system should be to provide
information on a maximum number of data sources, and to
provide the potential user with an adequate description so
that he may then decide if he is interested in using a
particular data base. This approach also reduces the overhead
involved in more complex cataloguing systems and allows for
uniform cataloguing of virtually all types of data bases.
Finer details of the data file and its availability would
then be left for the supplier and user to arrange
themselves.

Examples of the proposed system follow in Section V.



V. Examples

Owner: Office of Fiscal Affairs, Oregon State University,
       Corvallis, Ore.
Author: O.S.U. Computer Center
Name: UPDATE II
Subject: Personnel data for non-classified employees of
         Oregon State University, including information
         on appointments, degrees, FTE, tenure, and
         other related information.
Date: Time Span = 1966 to date
      Volume Period = 1 academic year
      Release Date = July 1
      Retention = 10 years
Record Count: 3200 records
Fields/Record: 48
Security: Semi-public

Owner: Library, Oregon State University, Corvallis, Ore.
Authors: Jennings, Michael A.; Spigai, Frances; Mahan, Thomas
Name: LOLITA
Subject: Book ordering and fund accounting system handling
         purchases, receipts, gifts, etc. and all necessary
         accounting and report printing.
Date: Creation Date = March 1970 (daily changes)
      Retention = Indefinite
Record Count: 6000
Fields/Record: approx. 31
Security: Semi-public
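
For readers who wish to hold such entries in machine-readable form
themselves, the following Python sketch encodes the LOLITA entry
above and prints it in a labelled layout similar to the examples.
The class CatalogueEntry, the function render, and the choice of
plain strings for every item are assumptions made for illustration;
the report does not prescribe a storage format.

```python
from dataclasses import dataclass


@dataclass
class CatalogueEntry:
    owner: str
    author: str
    name: str
    subject: str
    date: dict              # sub-field name -> value
    record_count: str
    fields_per_record: str
    security: str


def render(entry):
    """Format an entry as labelled lines."""
    lines = [
        f"Owner: {entry.owner}",
        f"Author: {entry.author}",
        f"Name: {entry.name}",
        f"Subject: {entry.subject}",
    ]
    label = "Date: "
    for sub_field, value in entry.date.items():
        lines.append(f"{label}{sub_field} = {value}")
        label = " " * len(label)  # align later sub-fields under the first
    lines += [
        f"Record Count: {entry.record_count}",
        f"Fields/Record: {entry.fields_per_record}",
        f"Security: {entry.security}",
    ]
    return "\n".join(lines)


lolita = CatalogueEntry(
    owner="Library, Oregon State University, Corvallis, Ore.",
    author="Jennings, Michael A.; Spigai, Frances; Mahan, Thomas",
    name="LOLITA",
    subject="Book ordering and fund accounting system.",
    date={"Creation Date": "March 1970 (daily changes)",
          "Retention": "Indefinite"},
    record_count="6000",
    fields_per_record="approx. 31",
    security="Semi-public",
)
print(render(lolita))
```

The record count and fields per record are kept as text because
the examples include values such as "approx. 31" that are not
plain numbers.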



VI. Glossary

Character - Primary element of information storage, for
            instance, letters of the alphabet, numbers,
            and special characters. An important means
            of determining data storage or transmission
            capacity.

Data Base - Reservoir of data; a collection of facts which
            may or may not be related structurally, but
            must be available to the facilities of the
            system where used. Thus, a machine-readable
            data base must be directly available to the
            computer or data-processing equipment.

Field - A grouping of one or more characters which
        is treated as a whole for the purpose of
        representing a particular category of data
        within a record.

File - A collection of related records treated as a
       unit. Any collection of informational items
       similar in purpose, form, and content, and
       structurally related.

Medium - The equipment or material within which data
         is stored, e.g., tape, cards, disk, etc.

Mode - The method of structuring or coding
       information on a file, e.g., binary, BCD, ASCII, etc.

Record - A group of one or more consecutive fields on
         a related subject; a group of related facts
         or fields of information treated as a unit;
         a unique data entry.

Record Count - The number of records within a file; the
               number of unique data entries.

Release Date - Date on which a dated file or data base is
               released or made available for general use,
               following the closing of one volume time period.

Retention - Time period a data file is held before being
            disposed of or made available for change.

Security - Refers to the availability of data for general
           use. Having to do with the classification of
           information: private, semi-public, public.